Bounds for the discrete correlation of infinite sequences 
on k symbols and generalized Rudin-Shapiro sequences 
1 Introduction 

Pseudorandom sequences, i.e., deterministic sequences on finite alphabets with properties 
reminiscent of random sequences, are an intensively studied subject. We refer to the series of 
papers by Mauduit, Sarkozy and coauthors [H IH El US [I3] among many others. A great part 
of the mentioned work deals with correlation measures for binary sequences and the problem 
to find large classes of finite pseudorandom binary sequences with small autocorrelation. Let 
$x = x_0 x_1 \cdots x_N \in \{-1, 1\}^{N+1}$ be a finite word over the alphabet $\{-1, 1\}$. Then the correlation 
measure of order $m$ of $x$ is defined as 

$$U_m(x) = \max_{M, r} \left| \sum_{n=0}^{M} x_{n+r_1} x_{n+r_2} \cdots x_{n+r_m} \right|, \tag{1.1}$$

where the maximum is taken over all $r = (r_1, r_2, \ldots, r_m)$ with $0 \le r_1 < r_2 < \cdots < r_m$ and 
$M$ such that $M + r_m \le N$. In case of infinite words the correlation of order $m$ 
is defined as 

$$\sum_{n=0}^{M} x_{n+r_1} x_{n+r_2} \cdots x_{n+r_m} \tag{1.2}$$

with fixed $r$. In contrast to $U_m(x)$, this definition does not take "large-range correlations" 
into account. In fact, could be Q{N) for the finite word correlation p,2j. Recently, 
Mauduit and Sarkozy [H] generalized several measures for pseudorandomness to finite 
sequences over $k$-letter alphabets. These distribution measures have been studied by Berczi [3] 
from a probabilistic point of view. 

The aim of the present paper is to study the discrete correlation among members of 
arbitrary infinite sequences over k symbols, where we just take into account whether two 
symbols are identical. In the sequel, we denote by $\mathbb{N}$ the set of non-negative integers, and we 
assume that sums start with index $0$ (empty sums are supposed to be zero), unless otherwise 
stated. We further denote by $n \bmod k$ the unique integer $n'$ with $0 \le n' \le k - 1$ and $n \equiv n'$ 
(mod $k$). We use "word" and "sequence" interchangeably. 






Let $x = x_0 x_1 \cdots$ be an infinite word over an alphabet of size $k$. Without loss of generality 
we may assume that $x_i \in \{0, 1, \ldots, k-1\}$ for $i \in \mathbb{N}$. For vectors $(i_1, \ldots, i_m)$ with integers 
$i_j$ ($1 \le j \le m$) satisfying $0 \le i_1 < i_2 < \cdots < i_m$, define the discrete correlation coefficient 
$\delta(i_1, i_2, \ldots, i_m)$ of order $m$ by 

$$\delta(i_1, i_2, \ldots, i_m) = \begin{cases} 0, & \text{if } x_{i_1} = x_{i_2} = \cdots = x_{i_m}; \\ 1, & \text{otherwise.} \end{cases}$$

Moreover, define $C_r$ for all fixed $r = (r_1, r_2, \ldots, r_m)$ with $0 \le r_1 < r_2 < \cdots < r_m$ by 

$$C_r = \liminf_{N \to \infty} \frac{1}{N} \sum_{n<N} \delta(n + r_1, n + r_2, \ldots, n + r_m). \tag{1.3}$$

It is important to remark that for a random sequence (where every symbol is independently 
chosen with probability $1/k$) the quantity $C_r$ equals $1 - 1/k^{m-1}$ with probability one. 
In this paper we investigate sequences with respect to this leading term. We first show by 
combinatorial means that for any infinite sequence on $k$ symbols the quantity $C_r$ cannot be 
too large for all $r$ (Theorem 2.3). Our result, however, does not rule out the existence of 
deterministic sequences that actually attain our bound. We provide such a construction in 
the case of $m = 2$ by introducing generalized Rudin-Shapiro sequences on $k$ symbols, which 
extends a construction by Queffelec [15] and Høholdt, Jensen and Justesen [7], [8]. The 
motivation stems from the fact that the autocorrelation $C_{(r_1, r_2)}$ of the infinite Rudin-Shapiro 
sequence on two symbols is small [13, Theorem 4]. Our construction, however, gives a large 
class of sequences with small autocorrelation for any alphabet with cardinality $k$, whenever 
$k$ is prime or squarefree. 
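
The leading term $1 - 1/k^{m-1}$ for random sequences can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `delta` and `empirical_C`, the parameters and the fixed seed are illustrative assumptions, and a finite average over $n < N$ only approximates the liminf in (1.3).

```python
import random

def delta(x, idx):
    """Discrete correlation coefficient: 0 if all symbols at positions idx agree, else 1."""
    first = x[idx[0]]
    return 0 if all(x[i] == first for i in idx) else 1

def empirical_C(x, r, N):
    """Average of delta(n + r_1, ..., n + r_m) over n < N, a finite-N proxy for C_r."""
    return sum(delta(x, [n + rj for rj in r]) for n in range(N)) / N

k, r, N = 3, (0, 1, 2), 20000       # illustrative parameters, not taken from the paper
rng = random.Random(0)              # fixed seed, used only for reproducibility
x = [rng.randrange(k) for _ in range(N + max(r))]
estimate = empirical_C(x, r, N)
target = 1 - 1 / k ** (len(r) - 1)  # leading term 1 - 1/k^(m-1)
print(f"estimate={estimate:.4f}, target={target:.4f}")
```

For a uniformly random word the printed estimate should lie close to the target, with fluctuations of order $N^{-1/2}$.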

The paper is structured as follows. In Section 2 we state the general bounds for the 
discrete correlation in Theorems 2.3 and 2.4. In Section 3 we give the definition of generalized 
Rudin-Shapiro sequences. Sections 4 and 5 are devoted to the combinatorial proofs of 
Theorems 2.3 and 2.4, respectively. In Section 6 we give the proof of Theorem 2.6 by using 
the Lovasz local lemma. Finally, in Sections 7 and 8 we give the proofs for Theorems 3.1 
and 3.3 by means of exponential sums. 



2 General bounds for the discrete correlation 

We wish to establish upper bounds for $C_r$ as $r$ gets "large". To begin with, we normalize 
the vector $r$. For an integer sequence $T = (t_0, t_1, \ldots)$ with $t_i + r_1 \ge 0$ for $i \in \mathbb{N}$, we define 
shifted versions of $C_r$, namely, 

$$C_{r,T} = \liminf_{N \to \infty} \frac{1}{N} \sum_{n<N} \delta(n + t_N + r_1, n + t_N + r_2, \ldots, n + t_N + r_m).$$

Proposition 2.1. Let $r = (r_1, r_2, \ldots, r_m)$ with $0 \le r_1 < r_2 < \cdots < r_m$, and let 
$T = (t_0, t_1, \ldots)$ be a sequence of integers with $t_i + r_1 \ge 0$ for all $i$. If $t_N = o(N)$, then $C_{r,T} = C_r$. 

Proof. We note that 

$$C_{r,T} = \liminf_{N \to \infty} \frac{1}{N} \sum_{n=t_N}^{N + t_N - 1} \delta(n + r_1, n + r_2, \ldots, n + r_m).$$

Since $\delta(n + r_1, n + r_2, \ldots, n + r_m) \in \{0, 1\}$ for all $n$, the above sum differs from the 
corresponding sum in (1.3) by at most $2 t_N$. Thus if $t_N = o(N)$, then 

$$C_{r,T} = \liminf_{N \to \infty} \frac{1}{N} \left( \sum_{n<N} \delta(n + r_1, n + r_2, \ldots, n + r_m) + o(N) \right) = C_r. \qquad \Box$$

By taking $T = (t, t, \ldots)$, Proposition 2.1 implies that $C_{r + t\mathbf{1}} = C_r$ for all constants 
$t \ge -r_1$, where $\mathbf{1}$ denotes the all-ones vector. We shall say $r$ is normalized whenever $r_1 = 0$ 
and $r_1 < r_2 < \cdots < r_m$, and henceforth only consider normalized $r$. In the $m = 2$ case, we then 
have $r = (0, r_2)$ and we can establish an upper bound by taking the limit as $r_2$ approaches 
infinity. We shall obtain the following result. 

Theorem 2.2. Let $x$ be an infinite word over an alphabet of size $k$. Then 

$$\liminf_{r_2 \to \infty} C_{(0, r_2)} \le 1 - \frac{1}{k}. \tag{2.1}$$

In the next section we provide the construction of deterministic sequences with equality 
in (2.1). More precisely, we show that for generalized Rudin-Shapiro sequences ($k$ prime or 
squarefree) we have 

$$\inf_{r_2} \{ C_{(0, r_2)} \} = 1 - \frac{1}{k}.$$

To generalize Theorem 2.2 to larger values of $m$, we must precisely define the notion of "$r$ 
getting large". Let $\|\cdot\|$ be a norm on the finite dimensional vector space $\mathbb{R}^m$. We will prove 
the following upper bound on $C_r$ as $\|r\|$ tends to infinity: 

Theorem 2.3. Let $x$ be an infinite word over an alphabet of size $k$. Then for any $m \ge 2$ 
and any norm $\|\cdot\|$, we have 

$$\lim_{A \to \infty} \left( \inf \{ C_r : r \in \mathbb{N}^m,\ r \text{ normalized},\ \|r\| > A \} \right) \le 1 - \frac{1}{k^{m-1}}. \tag{2.2}$$

We note that Theorem 2.2 is immediately implied by Theorem 2.3 by taking $m = 2$. 
Theorem 2.3 is proven via a combinatorial argument in Section 4. 

In order to also consider the local autocorrelation properties of sequences, we define a 
related quantity. Again, let $x$ be an infinite word over an alphabet of size $k$. For a given 
vector $r$ and positive integers $d$, we define 

$$D_r^d = \inf_{n \ge 0} \left( \frac{1}{d} \sum_{i=n}^{n+d-1} \delta(i + r_1, i + r_2, \ldots, i + r_m) \right). \tag{2.3}$$

Note that for a random sequence on $k$ symbols, we necessarily have $D_r^d = 0$ for all $r$ and $d$. 
We will prove that for a given vector $r$, the value of $C_r$ of an infinite sequence is an upper 
bound for all of the values of $D_r^d$: 
Theorem 2.4. Let $x$ be an infinite word over an alphabet of size $k$, $r$ be normalized and 
$d > 0$. Then $D_r^d \le C_r$. 
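
The relation between the windowed quantity and the global average can be illustrated on a finite prefix. This is a sketch under stated assumptions: `window_min` and `prefix_average` are hypothetical helper names, the Thue-Morse word over two symbols is used only as a convenient test word, and a finite prefix merely stands in for the infimum in (2.3) and the liminf in (1.3).

```python
def delta(x, idx):
    """0 if all symbols at positions idx agree, else 1."""
    first = x[idx[0]]
    return 0 if all(x[i] == first for i in idx) else 1

def window_min(x, r, d, N):
    """Minimum over 0 <= n <= N - d of (1/d) * sum_{i=n}^{n+d-1} delta(i + r_1, ..., i + r_m)."""
    vals = [delta(x, [i + rj for rj in r]) for i in range(N)]
    s = sum(vals[:d])
    best = s
    for n in range(1, N - d + 1):
        s += vals[n + d - 1] - vals[n - 1]   # slide the window by one position
        best = min(best, s)
    return best / d

def prefix_average(x, r, N):
    """Average of delta over the first N positions."""
    return sum(delta(x, [i + rj for rj in r]) for i in range(N)) / N

N, d, r = 64, 4, (0, 1)                       # illustrative parameters
thue_morse = [bin(n).count("1") % 2 for n in range(N + max(r))]
D = window_min(thue_morse, r, d, N)
avg = prefix_average(thue_morse, r, N)
print(D, avg)
```

When $d$ divides $N$, the prefix splits into disjoint windows of length $d$, each contributing at least the windowed minimum, which is the same averaging idea used in the proof in Section 5.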

As an immediate consequence of Theorem 2.3 and Theorem 2.4 we obtain an upper 
bound on $D_r^d$ as $\|r\|$ tends to infinity. 

Corollary 2.5. Let $x$ be an infinite word over an alphabet of size $k$. Then for any $m \ge 2$, 
$d > 0$, and norm $\|\cdot\|$, we have 

$$\lim_{A \to \infty} \left( \inf \{ D_r^d : r \in \mathbb{N}^m,\ r \text{ normalized},\ \|r\| > A \} \right) \le 1 - \frac{1}{k^{m-1}}. \tag{2.4}$$

An interesting example occurs when we choose a fixed $d > 0$ and take 

$$r = (0, d, 2d, \ldots, (m-1)d).$$

Then for each subword $w_1 w_2 \cdots w_m$ of $x$ with $|w_i| = d$ for all $i$, the number of indices $j$ where 
$|\{ w_i[j] : 1 \le i \le m \}| > 1$ is at least $d D_r^d$. In this case, for sufficiently large $d$, we can get 
arbitrarily close to the bound in (2.4). 

Theorem 2.6. For all $\varepsilon > 0$ there exist an infinite word $x$ over an alphabet of size $k$ and 
$d_0 = d_0(\varepsilon)$ such that for all $d \ge d_0$ and $r = (0, d, 2d, \ldots, (m-1)d)$ we have 

$$D_r^d \ge 1 - \frac{1}{k^{m-1}} - \varepsilon.$$



3 Generalized Rudin-Shapiro sequences 

The quantity $C_r$ has been studied for various special sequences. A classical result of 
Mahler [10] states that for the Thue-Morse sequences over $k$ symbols, the summatory 
correlation has no uniform leading term. On the contrary, Queffelec [15] noted (referring to an 
unpublished result by Kamae) that the Rudin-Shapiro sequence indeed has the desired 
leading term, whenever $r$ is fixed. As for the hub of the present article, Mauduit and Sarkozy 
[13, Corollary after Theorem 4] showed that for the correlation of order 2 one may let $r_2 = o(N)$ 
without losing this property. The following definition gives an extension to alphabets of size 
$k > 2$. 

Definition 3.1. Let $g : \{0, 1, \ldots, k-1\} \times \mathbb{Z} \to \mathbb{Z}$, $(j, n) \mapsto g(j, n)$ be a function which 
is periodic in $n$ with period $k$. Furthermore, let $g$ be such that for all integers $u$, $i$ with 
$0 \le u < u + i \le k - 1$ we have 

$$\{ (g(u + i, n) - g(u, n)) \bmod k : 0 \le n \le k - 1 \} = \{0, 1, \ldots, k-1\}.$$

Then we call a sequence $(\bar{a}(n))_{n \ge 0}$ over the alphabet $\{0, 1, \ldots, k-1\}$ a generalized 
Rudin-Shapiro sequence if there exists a sequence of integers $(a(n))_{n \ge 0}$ such that 
$\bar{a}(n) \equiv a(n) \pmod{k}$ and 

$$a(nk + j) = a(n) + g(j, n), \qquad 0 \le j \le k - 1, \quad n \ge 1. \tag{3.1}$$

The function $g$ is called an admissible function. 
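
The admissibility condition of Definition 3.1 is a finite check, which can be sketched as follows. The function name `is_admissible` and the three sample functions are illustrative assumptions; periodicity of $g$ in $n$ is assumed rather than verified, and the sample functions encode $g(j, n) = j \cdot (n \bmod k)$, the choice $g(1, 0) = 1$ with all other values $0$ for $k = 2$, and the indicator of $j \equiv n \pmod 3$ for $k = 3$.

```python
def is_admissible(g, k):
    """Check that for all 0 <= u < u + i <= k - 1 the differences
    (g(u + i, n) - g(u, n)) mod k, n = 0..k-1, cover every residue class mod k."""
    residues = set(range(k))
    for u in range(k):
        for i in range(1, k - u):            # ensures u + i <= k - 1
            diffs = {(g(u + i, n) - g(u, n)) % k for n in range(k)}
            if diffs != residues:
                return False
    return True

def canonical(k):
    """g(j, n) = j * (n mod k)."""
    return lambda j, n: j * (n % k)

def g_block01(j, n):
    """k = 2: g(1, 0) = 1 and g = 0 otherwise."""
    return 1 if (j == 1 and n % 2 == 0) else 0

def g_equal3(j, n):
    """k = 3: g(j, n) = 1 if j = n (mod 3), else 0."""
    return 1 if (j - n) % 3 == 0 else 0

print(is_admissible(canonical(3), 3), is_admissible(canonical(4), 4))
```

The canonical choice passes for prime $k$ but fails for $k = 4$, since $2n \bmod 4$ only takes the values $0$ and $2$.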
Example 1: A "canonical" admissible function $g$ in the sense of Definition 3.1 is 

$$g(j, n) = j \cdot (n \bmod k), \tag{3.2}$$

which is Queffelec's generalization for the ordinary Rudin-Shapiro sequence [15, Section 4]. 
In this case $g(u + i, n) - g(u, n) \equiv in \pmod{k}$, and $\{ in : 0 \le n \le k - 1 \}$ runs for $i$ with 
$1 \le i \le k - 1$ through all residue classes mod $k$, provided $k$ is prime. In particular, for $k = 2$ 
and 

$$g(j, n) = \begin{cases} 1, & \text{if } j \equiv 1,\ n \equiv 1 \pmod{2}; \\ 0, & \text{otherwise} \end{cases}$$

we get the Rudin-Shapiro sequence over the alphabet $\{0, 1\}$, namely, 

$$(\bar{a}(n))_{n \ge 0} = 0, 0, 0, 1, 0, 0, 1, 0, \ldots,$$

where the corresponding sequence $a(n)$ counts the number of subblocks (1, 1) in the binary 
expansion of $n$. 
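
The recursion (3.1) translates directly into a short generator. This is a minimal sketch: the function name `rs_integers` and the default initial values $a(0) = \cdots = a(k-1) = 0$ are assumptions made for illustration, since the definition leaves the initial values free.

```python
def rs_integers(g, k, length, initial=None):
    """Integer sequence a(n) with a(nk + j) = a(n) + g(j, n) for n >= 1.
    The first k values are taken from `initial` (default: all zeros)."""
    a = list(initial) if initial is not None else [0] * k
    while len(a) < length:
        n, j = divmod(len(a), k)   # len(a) = n*k + j with n >= 1
        a.append(a[n] + g(j, n))
    return a[:length]

def count_11_blocks(n):
    """Number of (possibly overlapping) blocks 11 in the binary expansion of n."""
    return sum(1 for i in range(n.bit_length()) if (n >> i) & 3 == 3)

k = 2
a = rs_integers(lambda j, n: j * (n % k), k, 64)
reduced = [v % k for v in a]       # the sequence over the alphabet {0, 1}
print(reduced[:8])
```

With the canonical function for $k = 2$, the reduced sequence starts with the values listed in Example 1, and the integer sequence agrees with the count of blocks 11 in the binary expansion.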

Example 2: For $k = 2$ and appropriate initial conditions, we get sequences which count 
any fixed block of size two. For instance, by setting 

$$g(1, 0) = 1, \qquad g(0, 0) = g(1, 1) = g(0, 1) = 0,$$

the resulting sequence $(\bar{a}(n))_{n \ge 0}$ counts (mod 2) the number of subblocks (01) in the binary 
expansion of $n$. 

Example 3: For $k = 3$ an admissible function other than (3.2) is given by 

$$g(j, n) = \begin{cases} 1, & \text{if } j \equiv n \pmod{3}; \\ 0, & \text{otherwise.} \end{cases}$$

Here, the resulting sequence $(\bar{a}(n))_{n \ge 0}$ (with initial conditions $a(0) = a(1) = a(2) = 0$) 
gives the cumulative number of appearances (mod 3) of subblocks (00), (11) and (22) in the 
ternary expansion of integers. 

The following theorem shows that generalized Rudin-Shapiro sequences resemble the 
discrete autocorrelation behavior of random sequences if $m = 2$. 

Theorem 3.1. Let 

$$\bar{a}(0), \bar{a}(1), \bar{a}(2), \ldots$$

be a generalized Rudin-Shapiro sequence over $\{0, 1, \ldots, k-1\}$ with $k$ prime. Moreover, let 
$0 \le r_1 < r_2$. Then, as $N \to \infty$, we have 

$$\sum_{n<N} \delta(n + r_1, n + r_2) = \left( 1 - \frac{1}{k} \right) N + O_k\!\left( (r_2 - r_1) \log \frac{N}{r_2 - r_1} + r_2 \right), \tag{3.3}$$

where the implied constant only depends on $k$. 
In the proof, we give an explicit value for the implied constant. As an immediate 
consequence we note 

Corollary 3.2. In the setting of Theorem 3.1, if $r_2 = o(N)$ then 

$$\sum_{n<N} \delta(n + r_1, n + r_2) \sim \left( 1 - \frac{1}{k} \right) N.$$
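
The asymptotic statement can be probed numerically for small prime $k$. The sketch below is illustrative only: it assumes the canonical function $g(j, n) = j \cdot (n \bmod k)$, zero initial values, the shift $r = (0, 1)$ and modest values of $N$; the helper names are hypothetical.

```python
def generalized_rs(g, k, length):
    """Reduced sequence a(n) mod k, where a(nk + j) = a(n) + g(j, n) and a(0..k-1) = 0."""
    a = [0] * k
    while len(a) < length:
        n, j = divmod(len(a), k)
        a.append(a[n] + g(j, n))
    return [v % k for v in a[:length]]

def mismatch_fraction(seq, r1, r2, N):
    """(1/N) * sum_{n<N} delta(n + r1, n + r2) for order m = 2."""
    return sum(1 for n in range(N) if seq[n + r1] != seq[n + r2]) / N

results = {}
for k, N in ((2, 4096), (3, 6561)):
    seq = generalized_rs(lambda j, n, k=k: j * (n % k), k, N + 1)
    results[k] = mismatch_fraction(seq, 0, 1, N)
    print(k, results[k], 1 - 1 / k)
```

The printed fractions should be close to $1 - 1/k$, in line with the logarithmic error term in (3.3).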



It seems natural to consider the cross product of two generalized Rudin-Shapiro sequences 
to prime bases to construct an extremal sequence for squarefree $k$. Let $k = p_1 p_2 \cdots p_d$ be a 
product of pairwise distinct primes, and put $c_1 = 1$, $c_i = p_1 p_2 \cdots p_{i-1}$ for $2 \le i \le d$. We 
define the sequence $(\bar{a}(n))_{n \ge 0}$ by 

$$\bar{a}(n) = a(n) \bmod k, \tag{3.4}$$

where $(a(n))_{n \ge 0}$ is defined by 

$$a(n) = c_1 a_1(n) + c_2 a_2(n) + \cdots + c_d a_d(n). \tag{3.5}$$

Herein, $(a_i(n))_{n \ge 0}$ satisfies the recursive relation 

$$a_i(p_i n + j) = a_i(n) + g_i(j, n), \qquad 1 \le i \le d, \tag{3.6}$$

for $n \ge 1$ and $0 \le j \le p_i - 1$. Again, the functions $g_i$ are admissible functions in the sense 
of Definition 3.1 for $1 \le i \le d$. Our next result gives an estimate for the correlation of order 
two. 
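
The combination (3.4)-(3.6) can be sketched in a few lines. The names `component` and `squarefree_rs` are hypothetical; the sketch assumes the canonical function $g_i(j, n) = j \cdot (n \bmod p_i)$ for every prime and zero initial values, which is only one admissible choice.

```python
def component(p, g, length):
    """Integer sequence a_i(n) with a_i(p*n + j) = a_i(n) + g(j, n), a_i(0..p-1) = 0."""
    a = [0] * p
    while len(a) < length:
        n, j = divmod(len(a), p)
        a.append(a[n] + g(j, n))
    return a[:length]

def squarefree_rs(primes, length):
    """Reduced sequence a(n) mod k with a(n) = sum_i c_i * a_i(n), c_i = p_1 * ... * p_{i-1}."""
    k, coeffs = 1, []
    for p in primes:
        coeffs.append(k)           # c_1 = 1, c_i = p_1 * ... * p_{i-1}
        k *= p
    comps = [component(p, lambda j, n, p=p: j * (n % p), length) for p in primes]
    return [sum(c * comp[n] for c, comp in zip(coeffs, comps)) % k for n in range(length)]

seq6 = squarefree_rs([2, 3], 8)    # k = 6 = 2 * 3, illustrative choice
print(seq6)
```

For $k = 6$ this gives $a(n) = a_1(n) + 2 a_2(n)$ reduced mod 6, with $a_1$ built to base 2 and $a_2$ to base 3.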

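Theorem 3.3. Let $k = p_1 p_2 \cdots p_d$ with $d \ge 2$ be squarefree and denote by 

$$\bar{a}(0), \bar{a}(1), \bar{a}(2), \ldots$$

a generalized Rudin-Shapiro sequence over $\{0, 1, \ldots, k-1\}$ defined by (3.4), (3.5) and (3.6). 
Moreover, let $0 \le r_1 < r_2$ and $0 < \gamma < 1$. Then, as $N \to \infty$, we have 

$$\sum_{n<N} \delta(n + r_1, n + r_2) = \left( 1 - \frac{1}{k} \right) N + O_k\!\left( (r_2 - r_1) N^{1 - \gamma/d} + (r_2 - r_1) N^{1 - \gamma} \log \frac{N^{\gamma/d}}{r_2 - r_1} + N^{\gamma} + r_1 \right), \tag{3.7}$$

where the implied constant only depends on $k$. 

Corollary 3.4. In the setting of Theorem 3.3, if $r_2 = o(N^{\gamma/d})$ then 

$$\sum_{n<N} \delta(n + r_1, n + r_2) \sim \left( 1 - \frac{1}{k} \right) N.$$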
4 Proof of Theorem 2.3 



We need the following lemma for our proof of Theorem 2.3. 

Lemma 4.1. Suppose we have a multiset of $n$ distinct objects of $k$ types, and let $d \le n$ be a 
fixed constant. Then among the $\binom{n}{d}$ subsets of $d$ objects, the number containing at least one 
pair of objects of different types is at most 

$$\frac{n^d}{d!} \left( 1 - \frac{1}{k^{d-1}} \right).$$
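Proof. Suppose we have $b_i$ objects of type $i$ for all $1 \le i \le k$. Then we have $\binom{b_i}{d}$ subsets 
consisting entirely of objects of type $i$. Thus the total number of subsets $P$ that contain at 
least one pair of objects of different types is 

$$P = \binom{n}{d} - \sum_{i=1}^{k} \binom{b_i}{d} = \frac{1}{d!} \left( n(n-1) \cdots (n-d+1) - \sum_{i=1}^{k} b_i (b_i - 1) \cdots (b_i - d + 1) \right).$$

Consider the polynomial $\phi(x) = x(x-1) \cdots (x-d+1) = e_1 x + \cdots + e_d x^d$. We rewrite our 
expression for $P$ in terms of $\phi$, 

$$P = \frac{1}{d!} \left( \phi(n) - \sum_{i=1}^{k} \phi(b_i) \right) = \frac{1}{d!} \left( \phi(n) - \left( e_1 \sum_{i=1}^{k} b_i + e_2 \sum_{i=1}^{k} b_i^2 + \cdots + e_d \sum_{i=1}^{k} b_i^d \right) \right).$$

By the power means inequality, 

$$\frac{1}{k} \sum_{i=1}^{k} b_i^{\nu} \ge \left( \frac{1}{k} \sum_{i=1}^{k} b_i \right)^{\nu} \quad \text{for all } \nu \ge 1,$$

and thus 

$$\sum_{i=1}^{k} b_i^{\nu} \ge \frac{n^{\nu}}{k^{\nu - 1}}.$$

We apply this bound to our expression for $P$ to yield the desired result, 

$$P \le \frac{1}{d!} \left( \phi(n) - \left( e_1 n + e_2 \frac{n^2}{k} + \cdots + e_d \frac{n^d}{k^{d-1}} \right) \right) \le \frac{n^d}{d!} \left( 1 - \frac{1}{k^{d-1}} \right). \qquad \Box$$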

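With our lemma in hand, we now prove Theorem 2.3. We proceed via contradiction. 
Suppose that for some $m \ge 2$ and some norm $\|\cdot\|$ on $\mathbb{R}^m$, there exists an $\varepsilon > 0$ such that 

$$\lim_{A \to \infty} \left( \inf \{ C_r : r \in \mathbb{N}^m,\ r \text{ normalized},\ \|r\| > A \} \right) = 1 - \frac{1}{k^{m-1}} + \varepsilon.$$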

We assume without loss of generality that e < . Our limit implies that there is some 
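$A_0 \in \mathbb{R}$ such that for all normalized $r \in \mathbb{N}^m$ with $\|r\| > A_0$ we have 

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{i=0}^{N-1} \delta(i + r_1, i + r_2, \ldots, i + r_m) > 1 - \frac{1}{k^{m-1}} + \frac{\varepsilon}{2}. \tag{4.1}$$

We define $\rho(r) = \max\{r_j\} - \min\{r_j\}$ to be the range of $r$ and note that $\rho(r) = r_m$ whenever 
$r$ is normalized. Let $r^* = (0, \ldots, 0, 1) \in \mathbb{N}^m$ and let $p$ be an integer such that $p \|r^*\| > A_0$. 

Then whenever $r$ is normalized with $\rho(r) \ge p$, we have $\|r\| \ge \|p r^*\| = p \|r^*\| > A_0$. Hence, 
for all normalized $r$ with $\rho(r) \ge p$, we can pick $n_r \in \mathbb{N}$ by (4.1) such that for all $N \ge n_r$, 
we have 

$$\frac{1}{N} \sum_{i=0}^{N-1} \delta(i + r_1, i + r_2, \ldots, i + r_m) > 1 - \frac{1}{k^{m-1}} + \frac{\varepsilon}{3}. \tag{4.2}$$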

To construct our counterexample, we ensure that we have selected $p$ such that 

$$p > m, \tag{4.3}$$

and then pick $q \in \mathbb{N}$ such that the following both hold: 

(a) g > (4.4) 

(b) g™- > ^^(-^ -^)P"'-\ (4.5) 

Since there are finitely many normalized $r \in \mathbb{N}^m$ with $p \le \rho(r) < q$, we can then pick an 
$n \in \mathbb{N}$ such that the following both hold: 

(a) $n \ge n_r$ for all normalized $r$ with $p \le \rho(r) < q$. (4.6) 

(b) $n > \dfrac{18\, q\, m!}{\varepsilon}$. (4.7) 
Now, for any set $U \subseteq \mathbb{N}$ with $|U| = m$, there is a unique normalized vector $r^U$ and 
integer offset $\mu(U)$ such that the vector $r^U + \mu(U)\mathbf{1}$ is an ordering of the elements of $U$. 
We write $\delta(U)$ to denote the correlation coefficient associated to this vector, namely 

$$\delta(U) = \delta(r_1^U + \mu(U), r_2^U + \mu(U), \ldots, r_m^U + \mu(U)).$$

We also write $\rho(U) = \max(U) - \min(U)$ for the range of $U$. It follows that $\rho(U) = \rho(r^U)$ 
and $\mu(U) = \min(U)$. With these definitions in hand, we consider the following sum, which 
will be counted in two different ways to achieve our contradiction: 

$$S = \sum_{a=0}^{n-1} \left( \sum_{\substack{U \subseteq \{a, \ldots, a+q-1\} \\ |U| = m}} \delta(U) \right).$$

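We first use Lemma 4.1 to bound $S$ from above. The sum 

$$\sum_{\substack{U \subseteq \{a, \ldots, a+q-1\} \\ |U| = m}} \delta(U)$$

counts the number of subsets of $m$ elements from the multiset $[x_a, x_{a+1}, \ldots, x_{a+q-1}]$ that 
contain at least one pair of distinct symbols of the $k$ possible symbols. Thus Lemma 4.1 
applies, yielding 

$$S \le \sum_{a=0}^{n-1} \frac{q^m}{m!} \left( 1 - \frac{1}{k^{m-1}} \right) = \frac{n q^m}{m!} \left( 1 - \frac{1}{k^{m-1}} \right). \tag{4.8}$$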

Next, we will attempt to bound $S$ from below by expressing it in terms of partial sums 
of the form seen in (4.2). Our first goal will be to rearrange this sum according to the 
multiplicity of $\delta(U)$ for each $U$. Sets $U$ will be subsets of $\{a, \ldots, a+q-1\}$ for more values 
of $a$ if they have lower range, so we sort the terms according to the value of $\rho(U)$, yielding 

$$S = \sum_{b=m-1}^{q-1} \sum_{a=0}^{n-1} \left( \sum_{\substack{U \subseteq \{a, \ldots, a+q-1\} \\ |U| = m \\ \rho(U) = b}} \delta(U) \right).$$

For a given $U \subseteq \{0, \ldots, n+q-2\}$ with $|U| = m$, we have $U \subseteq \{a, \ldots, a+q-1\}$ if and 
only if $\min(U) \ge a$ and $\max(U) \le a + q - 1$. Thus $U \subseteq \{a, \ldots, a+q-1\}$ for precisely 
those $a$ with $\mu(U) + \rho(U) - (q-1) \le a \le \mu(U)$. However, when we rearrange our sum, we 
must count only those $a$ which also lie in the range $\{0, \ldots, n-1\}$. We rewrite our sum as 

$$S = \sum_{b=m-1}^{q-1} \left( \sum_{\substack{U \subseteq \{0, \ldots, n+q-2\} \\ |U| = m \\ \rho(U) = b}} \left( \sum_{a = \max\{\mu(U) + \rho(U) - (q-1),\, 0\}}^{\min\{\mu(U),\, n-1\}} \delta(U) \right) \right).$$
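We drop all terms containing elements less than $q$ or greater than $n - 1$. All the sets $U$ which 
remain will have $\mu(U) + \rho(U) - (q-1) \ge 0$ and $\mu(U) \le n - 1$, such that 

$$S \ge \sum_{b=m-1}^{q-1} \left( \sum_{\substack{U \subseteq \{q, \ldots, n-1\} \\ |U| = m \\ \rho(U) = b}} \left( \sum_{a = \mu(U) + \rho(U) - (q-1)}^{\mu(U)} \delta(U) \right) \right)$$

$$= \sum_{b=m-1}^{q-1} \left( \sum_{\substack{U \subseteq \{q, \ldots, n-1\} \\ |U| = m \\ \rho(U) = b}} (q - \rho(U))\, \delta(U) \right) = \sum_{b=m-1}^{q-1} \left( (q - b) \sum_{\substack{U \subseteq \{q, \ldots, n-1\} \\ |U| = m \\ \rho(U) = b}} \delta(U) \right).$$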

We now need to add back some of the terms we dropped and subtract away appropriate 
compensation. We can choose $U \subseteq \{0, \ldots, n-1\}$ with $|U| = m$, $\rho(U) = b$ and 
$U \not\subseteq \{q, \ldots, n-1\}$ by picking $\min(U) \in \{0, \ldots, q-1\}$, taking $\max(U) = \min(U) + b$, and then 
choosing the remaining $m - 2$ elements from $\{\min(U) + 1, \ldots, \min(U) + b - 1\}$. There are 
$q \binom{b-1}{m-2}$ ways of doing this. It is convenient to instead use $q b^{m-2}$ as an upper bound for this 
quantity; we then use the fact that $\delta(U) \in \{0, 1\}$ to write 

$$S \ge \sum_{b=m-1}^{q-1} \left( (q - b) \left( \sum_{\substack{U \subseteq \{0, \ldots, n-1\} \\ |U| = m \\ \rho(U) = b}} \delta(U) - q b^{m-2} \right) \right) \ge \sum_{b=m-1}^{q-1} \left( (q - b) \sum_{\substack{U \subseteq \{0, \ldots, n-1\} \\ |U| = m \\ \rho(U) = b}} \delta(U) \right) - q^{m+1}.$$

In a similar manner, we add back more terms so that we may consider all 
$U \subseteq \{0, \ldots, n+q-1\}$ with $|U| = m$ and $\rho(U) = b$, and subtract off another multiple of $q^{m+1}$ 
to compensate, 

$$S \ge \sum_{b=m-1}^{q-1} \left( (q - b) \sum_{\substack{U \subseteq \{0, \ldots, n+q-1\} \\ |U| = m \\ \rho(U) = b}} \delta(U) \right) - 2 q^{m+1}.$$

We now associate each set $U$ to its sorted vector $r^U + \mu(U)\mathbf{1}$ and group them according to 
their values. Since we count each subset of $\{0, \ldots, n+q-1\}$ having range $\le q - 1$, we 
are certain to include $r + i\mathbf{1}$ for every normalized $r$ of range $\le q - 1$ and every offset $i$ from 
$0$ to $n$. We drop any other terms and ignore those $r$ with $\rho(r) < p$ (recalling (4.3), where 
we ensured that $p > m$), leaving us with 



b=m—l 



\ 



v 



rGN™ 1=0 
r normalized 
p{r)=b 



-2q 



m+1 



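Finally, we may use (4.2) to bound the inner sums from below, since for all $r$ with $\rho(r) \ge p$ 
we have $n \ge n_r$ by (4.6). We then simply count the number of normalized $r$ vectors of each 
range, obtaining 

$$S > \sum_{b=p}^{q-1} \left( (q - b) \sum_{\substack{r \in \mathbb{N}^m \\ r \text{ normalized} \\ \rho(r) = b}} n \left( 1 - \frac{1}{k^{m-1}} + \frac{\varepsilon}{3} \right) \right) - 2 q^{m+1}$$

$$= n \left( 1 - \frac{1}{k^{m-1}} + \frac{\varepsilon}{3} \right) \sum_{b=p}^{q-1} \left( (q - b) \binom{b-1}{m-2} \right) - 2 q^{m+1}$$

$$\ge \frac{n}{(m-2)!} \left( 1 - \frac{1}{k^{m-1}} + \frac{\varepsilon}{3} \right) \sum_{b=p}^{q-1} \left( (q - b)(b - m)^{m-2} \right) - 2 q^{m+1}. \tag{4.9}$$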
We simplify and evaluate the remaining sum to get 

q-l q-l 



m—1 



m— 1 



{{q -h){h- m)'"-2) > ^ ((g + m - - m)'"-^) - mq 

b=p b=p 

q+m 

> Y [{q + m - b){b - m)™"^) - 2mq 

b=p 

q 

b=p—m 

q 

> ^ ((g - - 2mg"^-^ - qp""-^ 

b=0 

q 9-1 

6=0 6=0 

>q Ph'^-^db- r b""-^ db - 2mq'^-^ - qp""-^ 
Jo Jo 



m{m — 1) 

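We substitute this back into (4.9) to obtain 

$$S > \frac{n q^m}{m!} \left( 1 - \frac{1}{k^{m-1}} + \frac{\varepsilon}{3} \right) - 2 q^{m+1} - \frac{2 m n q^{m-1}}{(m-2)!} - \frac{n q p^{m-1}}{(m-2)!}. \tag{4.10}$$

What remains is to eliminate the three leftover terms on the right hand side with the 
bounds we used when selecting $q$ and $n$. First, by (4.4), we picked $q$ such that 

$$\left( \frac{n q^m}{m!} \right) \left( \frac{\varepsilon}{9} \right) \ge \frac{2 m n q^{m-1}}{(m-2)!}. \tag{4.11}$$

Second, by (4.5), we also picked $q$ such that 

$$\left( \frac{n q^m}{m!} \right) \left( \frac{\varepsilon}{9} \right) \ge \frac{n q p^{m-1}}{(m-2)!}. \tag{4.12}$$

Third, by (4.7), we picked $n$ such that 

$$\left( \frac{n q^m}{m!} \right) \left( \frac{\varepsilon}{9} \right) \ge 2 q^{m+1}. \tag{4.13}$$

Adding (4.11), (4.12), and (4.13) together, we get 

$$\left( \frac{n q^m}{m!} \right) \left( \frac{\varepsilon}{3} \right) \ge 2 q^{m+1} + \frac{2 m n q^{m-1}}{(m-2)!} + \frac{n q p^{m-1}}{(m-2)!},$$

and we substitute this into (4.10) to obtain 

$$S > \frac{n q^m}{m!} \left( 1 - \frac{1}{k^{m-1}} \right),$$

which contradicts (4.8), proving the desired result. □ 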



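5 Proof of Theorem 2.4 



Suppose, for our sequence, that there exists some $m \ge 2$, $r \in \mathbb{N}^m$, and $d > 0$ such that 
$D_r^d > C_r$. Let $\varepsilon = D_r^d - C_r$ and pick $p$ such that 

$$p > \frac{2 d D_r^d}{\varepsilon}.$$

Then by our definition of $C_r$, there is some $n \ge p$ such that 

$$\frac{1}{n} \sum_{j=0}^{n-1} \delta(j + r_1, \ldots, j + r_m) < C_r + \frac{\varepsilon}{2}.$$

Dividing $n$ by $d$, we let $n = ad + b$, where $a$ and $b$ are non-negative integers and $b < d$. Then 
rearranging our expression and applying the definition of $D_r^d$ yields: 

$$C_r > \frac{1}{n} \sum_{j=0}^{n-1} \delta(j + r_1, \ldots, j + r_m) - \frac{\varepsilon}{2}$$

$$= \frac{1}{n} \left( \sum_{i=0}^{a-1} \sum_{j=id}^{id+d-1} \delta(j + r_1, \ldots, j + r_m) + \sum_{j=ad}^{ad+b-1} \delta(j + r_1, \ldots, j + r_m) \right) - \frac{\varepsilon}{2}$$

$$\ge \frac{1}{n} \left( \sum_{i=0}^{a-1} d D_r^d + \sum_{j=ad}^{ad+b-1} 0 \right) - \frac{\varepsilon}{2} = \frac{a d D_r^d}{n} - \frac{\varepsilon}{2} \ge D_r^d - \frac{d D_r^d}{n} - \frac{\varepsilon}{2}.$$

However, since 

$$n \ge p > \frac{2 d D_r^d}{\varepsilon},$$

we then have 

$$\frac{d D_r^d}{n} < \frac{\varepsilon}{2},$$

and substituting this into the above yields 

$$C_r > D_r^d - \varepsilon = C_r.$$

Thus we have a contradiction, and so we have $D_r^d \le C_r$ for all $r$ and $d$. □ 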
6 Proof of Theorem 2.6 



It is sufficient to show that for all integers $k, m \ge 2$ and all real numbers $\varepsilon > 0$, there exist 
an integer $d_0$ and an infinite word $x = x_0 x_1 x_2 \cdots$ over a $k$-letter alphabet such that for every 
integer $d \ge d_0$ and $i \ge 0$ there are at least $(1 - \frac{1}{k^{m-1}} - \varepsilon) d$ positions where the $m$ words 

$$x_i \cdots x_{i+d-1},\quad x_{i+d} \cdots x_{i+2d-1},\quad \ldots,\quad x_{i+(m-1)d} \cdots x_{i+md-1}$$

do not all agree. We use the Lovasz local lemma to show the existence of finite words of 
every sufficiently long length satisfying the condition. The existence of an infinite word then 
follows from the usual compactness argument. 

Here is the statement of the Lovasz local lemma, as taken from [2, Chap. 5]. 

Lemma 6.1. Let $A_1, A_2, \ldots, A_T$ be events in a probability space, with a dependency digraph 
$D = (S, E)$. Suppose there exist real numbers $u_1, u_2, \ldots, u_T$ with $0 \le u_i < 1$ for $1 \le i \le T$ 
such that 

$$\Pr(A_i) \le u_i \prod_{(i,j) \in E} (1 - u_j) \tag{6.1}$$

for $1 \le i \le T$. Then the probability that none of the events $A_1, A_2, \ldots, A_T$ occur is 
$\ge \prod_{1 \le i \le T} (1 - u_i)$. 

Let $A_{i,d}$ denote the event that there are $\le t$ positions where the $m$ words 

$$x_i \cdots x_{i+d-1},\quad x_{i+d} \cdots x_{i+2d-1},\quad \ldots,\quad x_{i+(m-1)d} \cdots x_{i+md-1}$$

do not all agree. Moreover, let $S$ be the space of all such events $A_{i,d}$ and $(S, E)$ the 
dependency digraph specifying when one event is dependent on another, which corresponds to 
overlapping ranges of the word being constructed. 

To evaluate $\Pr[A_{i,d}]$ it suffices to count the number of such strings. First, we choose 
the values for the symbols of the first string, $x_i, \ldots, x_{i+d-1}$, which can be done in $k^d$ ways. 
Next, we choose the precise number of positions $j$ in which the $m$ strings will fail to agree, 
and the positions themselves. This can be done in $\sum_{0 \le j \le t} \binom{d}{j}$ ways. For each such position, 
there are $k^{m-1} - 1$ ways to choose the symbols of the remaining $m - 1$ strings in such a way 
that they do not universally agree with the first string. The remaining symbols in the last 
$m - 1$ strings are now completely determined, as they must agree with the symbols in the 
corresponding position in the first string. The total number of such strings is therefore 

$$k^d \sum_{0 \le j \le t} \binom{d}{j} (k^{m-1} - 1)^j.$$

We therefore find 

$$\Pr[A_{i,d}] = \frac{1}{k^{(m-1)d}} \sum_{0 \le j \le t} \binom{d}{j} (k^{m-1} - 1)^j.$$
To estimate this sum we use the following classical estimate on the tail of the binomial 
distribution, which is a version of Hoeffding's inequality [6]: 

Lemma 6.2. Suppose $0 < p < 1$, and let $t$, $d$ be positive integers with $t \le dp$. Then 

$$\sum_{0 \le j \le t} \binom{d}{j} p^j (1 - p)^{d-j} \le e^{-2(dp - t)^2 / d}.$$

If we now take $t = (1 - \frac{1}{k^{m-1}} - \varepsilon) d$, $p = 1 - \frac{1}{k^{m-1}}$, we obtain 

$$\Pr[A_{i,d}] \le e^{-2 \varepsilon^2 d}.$$
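
The tail estimate of Lemma 6.2 can be compared with the exact binomial sum for a few parameter choices. This is a hedged numerical sketch: the helper names and the sampled values of $k$, $m$, $\varepsilon$ and $d$ are illustrative, and $t$ is rounded down to an integer, which keeps $t \le dp$.

```python
from math import comb, exp

def binom_lower_tail(d, p, t):
    """Exact value of sum_{0 <= j <= t} C(d, j) p^j (1 - p)^(d - j)."""
    return sum(comb(d, j) * p ** j * (1 - p) ** (d - j) for j in range(t + 1))

def hoeffding_bound(d, p, t):
    """Right-hand side exp(-2 (dp - t)^2 / d) of Lemma 6.2, valid for t <= dp."""
    return exp(-2 * (d * p - t) ** 2 / d)

checks = []
for k, m, eps in ((2, 2, 0.1), (3, 2, 0.2), (2, 3, 0.25)):   # illustrative parameters
    p = 1 - 1 / k ** (m - 1)
    for d in (10, 50, 200):
        t = int((p - eps) * d)                               # integer t with t <= dp
        checks.append((binom_lower_tail(d, p, t), hoeffding_bound(d, p, t)))

print(all(tail <= bound for tail, bound in checks))
```

With $dp - t \ge \varepsilon d$ the bound is at most $e^{-2\varepsilon^2 d}$, which is the form used for $\Pr[A_{i,d}]$.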



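Now fix $n$, the length of the string. We want none of the events $A_{j,s}$ for $d_0 \le s \le n/m$, 
$0 \le j \le n - ms$, to take place. Choose $u_{j,s} = e^{-\varepsilon^2 s}$. Then 

$$\prod_{((i,d),(j,s)) \in E} (1 - u_{j,s}) = \prod_{\substack{i - ms + 1 \le j \le i + md - 1 \\ 0 \le j \le n - ms \\ d_0 \le s \le n/m}} (1 - u_{j,s}) \ge \prod_{s \ge d_0} (1 - u_{j,s})^{md + ms - 1}.$$

Taking logarithms, we get 

$$\sum_{((i,d),(j,s)) \in E} \log(1 - u_{j,s}) \ge \sum_{s \ge d_0} (md + ms - 1) \log(1 - u_{j,s}).$$

Provided $u_{j,s}$ is sufficiently small, we can bound $\log(1 - u_{j,s})$ with $-c u_{j,s}$ for some constant 
$c$. Hence we get 

$$\sum_{s \ge d_0} (md + ms - 1) \log(1 - u_{j,s}) \ge - \sum_{s \ge d_0} (md + ms - 1)\, c\, e^{-\varepsilon^2 s}$$

$$= -(md - 1) c \sum_{s \ge d_0} e^{-\varepsilon^2 s} - mc \sum_{s \ge d_0} s\, e^{-\varepsilon^2 s}$$

$$= -(md - 1) c\, \frac{e^{-\varepsilon^2 (d_0 - 1)}}{e^{\varepsilon^2} - 1} - mc\, \frac{e^{-\varepsilon^2 (d_0 - 1)} (1 - d_0) + d_0 e^{-\varepsilon^2 (d_0 - 2)}}{(e^{\varepsilon^2} - 1)^2}.$$

Now choose $d_0$ large enough so that 

$$\frac{e^{-\varepsilon^2 (d_0 - 1)}}{e^{\varepsilon^2} - 1} \le \frac{\varepsilon^2}{2mc}$$

and also large enough so that 

$$\frac{e^{-\varepsilon^2 (d_0 - 1)} (1 - d_0) + d_0 e^{-\varepsilon^2 (d_0 - 2)}}{(e^{\varepsilon^2} - 1)^2} \le \frac{\varepsilon^2 d_0}{2mc}.$$

It follows that 

$$\log \left( u_{i,d} \prod_{((i,d),(j,s)) \in E} (1 - u_{j,s}) \right) \ge -\varepsilon^2 d - (md - 1) c\, \frac{\varepsilon^2}{2mc} - mc\, \frac{\varepsilon^2 d_0}{2mc}$$

$$\ge -\varepsilon^2 d - \frac{\varepsilon^2 d}{2} - \frac{\varepsilon^2 d_0}{2} \ge -2 \varepsilon^2 d \ge \log \Pr[A_{i,d}],$$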

as desired. Hence, by the Lovasz local lemma, it follows that the probability that none of 
the events $A_{j,s}$ occur is $\ge \prod_{((i,d),(j,s)) \in E} (1 - u_{j,s}) > 0$, and hence such a string of length $n$ 
exists. □ 



7 Proof of Theorem 3.1 



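Before turning to the proof of Theorem 3.1 we need one auxiliary tool. We rewrite the 
left-hand-side expression of (3.3) in terms of exponential sums. As usual, set $e(z) = e^{2 \pi i z}$ for 
$z \in \mathbb{R}$. 

Proposition 7.1. For any infinite word $x_0 x_1 x_2 \cdots$ over $\{0, 1, \ldots, k-1\}$ we have 

$$\sum_{n<N} \delta(n + r_1, n + r_2) = N \left( 1 - \frac{1}{k} \right) - \frac{1}{k} \sum_{1 \le h < k} \sum_{n<N} e\!\left( \frac{h}{k} (x_{n+r_2} - x_{n+r_1}) \right).$$

Proof. The proof is based on the relation 

$$\sum_{0 \le h < k} e\!\left( \frac{hn}{k} \right) = \begin{cases} k, & \text{if } k \mid n; \\ 0, & \text{otherwise.} \end{cases}$$

First, since $x_n \in \{0, 1, 2, \ldots, k-1\}$ we notice that $k \mid (x_{n+r_2} - x_{n+r_1})$ if and only if 
$x_{n+r_2} = x_{n+r_1}$. Therefore, 

$$\sum_{n<N} \delta(n + r_1, n + r_2) = N - \frac{1}{k} \sum_{n<N} \sum_{0 \le h < k} e\!\left( \frac{h}{k} (x_{n+r_2} - x_{n+r_1}) \right)$$

$$= N \left( 1 - \frac{1}{k} \right) - \frac{1}{k} \sum_{n<N} \sum_{1 \le h < k} e\!\left( \frac{h}{k} (x_{n+r_2} - x_{n+r_1}) \right). \qquad \Box$$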
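
The identity of Proposition 7.1 holds for every word over $\{0, 1, \ldots, k-1\}$, so it can be confirmed numerically on an arbitrary example. In this sketch the test word and the helper names `lhs` and `rhs` are illustrative assumptions.

```python
import cmath

def e(z):
    """e(z) = exp(2*pi*i*z)."""
    return cmath.exp(2j * cmath.pi * z)

def lhs(x, r1, r2, N):
    """sum_{n<N} delta(n + r1, n + r2): number of n < N with x[n+r1] != x[n+r2]."""
    return sum(1 for n in range(N) if x[n + r1] != x[n + r2])

def rhs(x, k, r1, r2, N):
    """N(1 - 1/k) - (1/k) * sum_{1<=h<k} sum_{n<N} e(h (x[n+r2] - x[n+r1]) / k)."""
    s = sum(e(h * (x[n + r2] - x[n + r1]) / k) for h in range(1, k) for n in range(N))
    return N * (1 - 1 / k) - s / k

k, r1, r2, N = 5, 1, 4, 200                          # illustrative parameters
x = [(n * n + 3 * n) % k for n in range(N + r2)]     # arbitrary word over {0,...,k-1}
left, right = lhs(x, r1, r2, N), rhs(x, k, r1, r2, N)
print(left, right)
```

Up to floating-point error, the right-hand side is real and equal to the integer count on the left.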
In view of Theorem 3.1 and Proposition 2.1 it suffices to show that for all $1 \le h \le k - 1$ 
we have 

$$\sum_{n<N} e\!\left( \frac{h}{k} (\bar{a}(n + r) - \bar{a}(n)) \right) = O_k\!\left( r \log \frac{N}{r} + r \right), \tag{7.2}$$

where the implied constant only depends on $k$. Since $e(z + 1) = e(z)$, the left-hand-side sum 
in (7.2) can be rewritten in the form 

$$\sum_{n<N} e\!\left( \frac{h}{k} (a(n + r) - a(n)) \right). \tag{7.3}$$

In the sequel we will need the generalized quantities 

$$\gamma_N(r, f) = \sum_{n<N} e\!\left( \frac{h}{k} (a(n + r) - a(n)) \right) e\!\left( \frac{h f(n)}{k} \right), \tag{7.4}$$

where $f : \mathbb{N} \to \mathbb{Z}$ is an arbitrary periodic function with period $k$. We first show that for 
all such $f$ we have $\gamma_N(1, f) = O_k(\log N)$ for $N \ge k$. We will then use induction on $r$ to 
prove (7.2), which in turn proves Theorem 3.1. 

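We follow the reasoning of Mauduit [Hj. Regarding (7.4) we split the summation over 
$n < kN + j$ up according to the residue class of $n$ modulo $k$. We obtain 

$$\gamma_{kN+j}(1, f) = \sum_{n < kN+j} e\!\left( \frac{h}{k} (a(n+1) - a(n)) \right) e\!\left( \frac{h f(n)}{k} \right)$$

$$= \sum_{u=0}^{k-1} \sum_{kn + u < kN + j} e\!\left( \frac{h}{k} (a(kn + u + 1) - a(kn + u)) \right) e\!\left( \frac{h f(u)}{k} \right).$$

Thus, 

$$\gamma_{kN+j}(1, f) = \sum_{u=0}^{k-1} e\!\left( \frac{h}{k} (a(u + 1) - a(u)) \right) e\!\left( \frac{h f(u)}{k} \right) \tag{7.5}$$

$$+ \sum_{u=0}^{j-1} e\!\left( \frac{h}{k} (a(kN + u + 1) - a(kN + u)) \right) e\!\left( \frac{h f(u)}{k} \right) \tag{7.6}$$

$$+ \sum_{u=0}^{k-2} e\!\left( \frac{h f(u)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(kn + u + 1) - a(kn + u)) \right) \tag{7.7}$$

$$+ e\!\left( \frac{h f(k-1)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(kn + k) - a(kn + k - 1)) \right). \tag{7.8}$$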

The sums (7.5) and (7.6) are trivially bounded by $k + j \le 2k - 1$. Concerning (7.7) we note 
that for $0 \le u \le k - 2$ we have 

$$\left| \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(kn + u + 1) - a(kn + u)) \right) \right| = \left| \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(n) + g(u + 1, n) - a(n) - g(u, n)) \right) \right|$$

$$= \left| \sum_{1 \le n < N} e\!\left( \frac{h}{k} (g(u + 1, n) - g(u, n)) \right) \right|.$$

By our assumption $g(u + 1, n) - g(u, n)$ runs through a complete residue system mod $k$ for 
$1 \le n \le k$, so this sum is bounded in modulus by $k/2$. Therefore, (7.7) is bounded by 
$k(k-1)/2$. Finally, we rewrite the sum in (7.8) in the form 

$$\sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(kn + k) - a(kn + k - 1)) \right) = \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(n + 1) + g(0, n + 1) - a(n) - g(k - 1, n)) \right),$$

where $f(n) = g(0, n+1) - g(k-1, n)$ is again periodic with period $k$ in $n$. Summing up, we 
get 

$$|\gamma_{kN+j}(1, f)| \le |\gamma_N(1, f)| + \frac{k}{2} (k + 3). \tag{7.9}$$

From (7.9) and $|\gamma_n(1, f)| \le k - 1$ for $1 \le n \le k - 1$ and all $f$ we get by induction that for 
all $k$-periodic functions $f$ and all $N \ge k$, 

$$|\gamma_N(1, f)| \le \frac{k(k+3)}{2 \log k} \log N + k - 1. \tag{7.10}$$

For our induction on $r$ to work, we need one more initial value, namely 

$$\gamma_N(0, f) = \sum_{n<N} e\!\left( \frac{h f(n)}{k} \right),$$

which satisfies 

$$|\gamma_N(0, f)| \le \frac{k}{2}, \quad \text{if } f(\{0, 1, \ldots, k-1\}) = \{0, 1, \ldots, k-1\}. \tag{7.11}$$
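Now, let us consider the general case with $r = kM + i > 0$ where $M \ge 0$ and $0 \le i \le k - 1$ 
but $(M, i) \ne (0, 0)$. Similarly to (7.5)-(7.8) we have 

$$\gamma_{kN+j}(kM + i, f) = \sum_{u=0}^{k-2} e\!\left( \frac{h f(u)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(kn + u + kM + i) - a(kn + u)) \right) \tag{7.12}$$

$$+ e\!\left( \frac{h f(k-1)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(kn + k - 1 + kM + i) - a(kn + k - 1)) \right) \tag{7.13}$$

$$+ O(1),$$

where the implied constant is bounded in modulus by $2k - 1$. We again need a close inspection 
of the two sums (7.12) and (7.13). First, suppose $i \ne 0$. We rewrite the sum (7.12) 
in the form 

$$\sum_{u=0}^{k-1-i} e\!\left( \frac{h f(u)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(n + M) + g(u + i, n + M) - a(n) - g(u, n)) \right)$$

$$+ \sum_{u=k-i}^{k-2} e\!\left( \frac{h f(u)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(n + M + 1) + g(u + i - k, n + M + 1) - a(n) - g(u, n)) \right)$$

$$= \sum_{u=0}^{k-1-i} e\!\left( \frac{h f(u)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(n + M) - a(n)) \right) e\!\left( \frac{h f_1(n)}{k} \right)$$

$$+ \sum_{u=k-i}^{k-2} e\!\left( \frac{h f(u)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(n + M + 1) - a(n)) \right) e\!\left( \frac{h f_2(n)}{k} \right),$$

where 

$$f_1(n) = g(u + i, n + M) - g(u, n), \quad \text{for } 0 \le u \le k - 1 - i,$$

$$f_2(n) = g(u + i - k, n + M + 1) - g(u, n), \quad \text{for } k - i \le u \le k - 2.$$

Using (7.4) this yields 

$$\sum_{u=0}^{k-2} e\!\left( \frac{h f(u)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(kn + u + kM + i) - a(kn + u)) \right) \tag{7.14}$$

$$= \sum_{u=0}^{k-1-i} e\!\left( \frac{h f(u)}{k} \right) \gamma_N(M, f_1) + \sum_{u=k-i}^{k-2} e\!\left( \frac{h f(u)}{k} \right) \gamma_N(M + 1, f_2) + O(1),$$

where the $O(1)$-term comes from including $n = 0$ into (7.14) and therefore is trivially bounded 
in modulus by $(k - i) + (i - 1) = k - 1$. Consider the second sum (7.13) and let $i \ne 0$. Then 

$$a(k(n + M + 1) + i - 1) - a(kn + k - 1) = a(n + M + 1) - a(n) + g(i - 1, n + M + 1) - g(k - 1, n).$$

Therefore, 

$$\left| e\!\left( \frac{h f(k-1)}{k} \right) \sum_{1 \le n < N} e\!\left( \frac{h}{k} (a(kn + k - 1 + kM + i) - a(kn + k - 1)) \right) \right| \le |\gamma_N(M + 1, f_3)| + 1, \tag{7.15}$$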

where $f_3(n) = g(i - 1, n + M + 1) - g(k - 1, n)$. Now, from (7.12), (7.13), (7.14) and (7.15) 
we see that 

$$|\gamma_{kN+j}(kM + i, f)| \le |\gamma_N(M, f_1)| \cdot (k - i) + |\gamma_N(M + 1, f_2)| \cdot (i - 1) + |\gamma_N(M + 1, f_3)| + 1 + (2k - 1) + (k - 1). \tag{7.16}$$

Plugging in $M = 0$, using (7.10) and (7.11) and observing that $f_1(n) = g(u + i, n) - g(u, n)$ 
permutes $\{0, 1, \ldots, k-1\}$ by assumption, we get 

$$|\gamma_{kN+j}(i, f)| \le \frac{k(k-1)(k+3)}{2 \log k} \log N + \frac{k}{2} (2k + 3), \qquad 1 \le i \le k - 1.$$

This implies that for $1 \le i \le k - 1$ and all functions $f$ with period $k$ we have 

$$|\gamma_N(i, f)| \le \frac{k(k-1)(k+3)}{2 \log k} \log\!\left( \frac{N}{k} \right) + \frac{k}{2} (2k + 3), \qquad N \ge k. \tag{7.17}$$

On the other hand, if $0 \le u \le k - 1$ then 

$$a(k(n + M) + u) - a(kn + u) = a(n + M) - a(n) + g(u, n + M) - g(u, n),$$

so by joining (7.12) and (7.13) in case that $i = 0$ we directly get 

$$|\gamma_{kN+j}(kM, f)| \le \sum_{u=0}^{k-1} \left( |\gamma_N(M, f_4)| + 1 \right) + (2k - 1), \tag{7.18}$$

where $f_4(n) = g(u, n + M) - g(u, n)$. Therefore, by (7.10) and (7.18) applied for $M = 1$ we 
get 

$$|\gamma_{kN+j}(k, f)| \le \frac{k^2 (k+3)}{2 \log k} \log N + k^2 + 2k - 1, \tag{7.19}$$

provided $N \ge k$. Therefore, for all $N \ge k$, 

$$|\gamma_N(i, f)| \le \frac{k^2 (k+3)}{2 \log k} \log\!\left( \frac{N}{k} \right) + k^2 + 2k - 1, \tag{7.20}$$

for the whole range $1 \le i \le k$. We now start our induction on the parameter $r = kM + i$. We 
iterate (17.161) and (17.181) with (17.201) as an initial value to obtain for r = k^ + l,k^ + 2, . . . , 
with s > and for all > 

- (^) + + 2A: - 1) + X:(3A: - l)k^ 



2\ogk ^\k'+^J k-1 

This finishes the proof of Theorem 3.1. □ 



8 Proof of Theorem 3.3 



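For the proof of Theorem 3.3 it suffices to show that for all $1 \le h \le k - 1$ and $0 < \gamma < 1$ we 
have 

$$\sum_{n<N} e\!\left( \frac{h}{k} (a(n + r) - a(n)) \right) \ll N^{\gamma} + r N^{1 - \gamma/d} + r N^{1 - \gamma} \log \frac{N^{\gamma/d}}{r}, \tag{8.1}$$

where the implied constant only depends on $k$. We follow Kim [9, Section 4], however suitably 
modifying the argument to deal with the function $a$ not being $k$-additive in the usual sense. 
We need some more notation. Let $b = (b_1, b_2, \ldots, b_d)$ and set 

$$P_b = \{ n \in \mathbb{N} : n \equiv b_i \bmod p_i^{s_i},\ 1 \le i \le d \},$$

where $s_i$ is the unique integer with $p_i^{s_i} \le N^{\gamma/d} < p_i^{s_i + 1}$. Since the $p_i$'s denote different primes 
by assumption, we have 

$$\#\{ n < N : n \in P_b \} = \frac{N}{\prod_{i=1}^{d} p_i^{s_i}} + O(1).$$

Further set 

$$B = \{ (b_1, b_2, \ldots, b_d) : 0 \le b_i < p_i^{s_i} \text{ for } 1 \le i \le d \},$$

$$B_0 = \{ (b_1, b_2, \ldots, b_d) : 0 \le b_i < p_i^{s_i} - r \text{ for } 1 \le i \le d \}.$$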


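Now, consider $n = n_i p_i^{s_i} + b_i$ where $0 \le b_i < p_i^{s_i} - r$. We may assume that $n_i \ge 1$, which 
is true for most $n$, i.e. $N^{\gamma/d} < n < N$ (the error term of $N^{\gamma/d}$ is negligible in the final 
estimate). Write 

$$b_i + r = \beta'_{s_i - 1} p_i^{s_i - 1} + \beta'_{s_i - 2} p_i^{s_i - 2} + \cdots + \beta'_0,$$

$$b_i = \beta_{s_i - 1} p_i^{s_i - 1} + \beta_{s_i - 2} p_i^{s_i - 2} + \cdots + \beta_0,$$

where $\beta_\nu, \beta'_\nu \in \{0, 1, \ldots, p_i - 1\}$ for $0 \le \nu < s_i$. Furthermore, set 

$$v_i = \max(j : \beta'_j \ne 0,\ 0 \le j \le s_i - 1),$$

$$w_i = \max(j : \beta_j \ne 0,\ 0 \le j \le s_i - 1),$$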
which indicate the uppermost nonzero coefficients in the expansions. Then by (3.6) we can 
rewrite $a_i(n + r) - a_i(n)$ in the form 

a* {niPT + f3s,-iPT~^ + ■■■ + -ai {riip-^ + (3s,-ipt''^ + ■ ■ ■ + Po) 

Si-2 

= aiirii) + g0,^_i, rii) + ^ gi{(3l, (3l^^) 

Si-2 



- ai{ni) + gi{f3s,-i,ni) + ^ gi{(3^y, 



u=Q 

Si~2 



giiP's^-1, rii) - gii/3s,-i,ni) + {giiPl, Pl+i) - giiPu, Pu+i)) 



u=0 



aiipi + r) - ttiihi) + r, rii), 



where 



r, rii) = gii/3's^-v ^i) - giiPs,-uni) + ^ gi{Pl, (31^^) - ^ gi{(3^, 
Consequently, 

e (^{a{n + r)-a{n))j = YIl^ (^Ci{ai{n + r)-ai{n))j 

n<N ^ ^ n<N i=l ^ ^ 

= X] X] (^'^* ^^"-^^^ + r) - aiipi) + iJ,i{bi,r,ni))j 

beBo n<N i=l ^ ^ 

beB\Bo n<N ^ ^ 
nePt 

which equals 

^ JJe f %i (ai(6i + r) - aj(6j)) j ^ JJ e Cj/ii(6j, r, J (8.2) 



+ 5^ 5^ (e(^(a(n + r)-a(n))j 



- JJe Qci(ai(6i + r) -0^(6^) + /ii(&i, r, rii))^^ . (8.3) 
The second sum (8.3) is trivially bounded by (we follow [9]) 



r -i-r \ / N 



2\B\Bo\-Mn<N:neP,}« E^H^? + ^^^^ 

\i=l j=l / \[U=lPi , 

< rA^i-T/-^, (8.4) 

which is one of the error terms in the estimate. Now, consider the first sum (8.2). Let 

= {h^B: Vi = Wi and (3^^ = (3'^^ for all 1 < i < d}. 

Obviously, for every b E we have /ii(6i, r, rii) = for all n < iV, n G Pt,. We use a similar 
splitting as above, such that (18. 2p satisfies 

b£l3 i=l ^ ^ n<N 

n£Pb 

Our next task is to establish a bound for \B \ B''"\. Let pi' < r < pl'~^^. We have to count 
the number of 6j's with < bi < pi' such that performing the addition bi + r gives rise to a 
carry propagation which is transported to the digits (3^^ of 6j, thus giving a contribution to 
lii{bi,r,ni). A necessary condition for this effect is that 

Pu+l = Pu+2 = ■ ■ ■ = Psi-2 =Pi — '^- 

Hence 



\B\E-\<Y^{j^^' + {s,-l-t^j^^') 

i=l 

< > \r + pir \ — log r 

< r log iV^/'^. 
Summing up, we obtain 



e (^{a{n + r)-a{n))j = 

n<N ^ ^ 

5Z n ^ (a,(6. + r) - a,(6,))') J] 1 + O (riV^-^/'^ + riV^"^ logiV^/'^) 

b£l3 1=1 ^ ^ n<Ar 



n&Pb 



i=l bi=0 ^ ^ V-l--l-«=i^« / 

= AtJ] — ^e(-Q(a,(6, + r)-a,(6,)) +0(iV^ + riVi-^/'^). 

j=l ^« bi=0 ^ ^ 

Finally, we show how to obtain the saving in the exponent, which again finishes the proof 
of Theorem 3.3. Since $c_i = p_1 p_2 \cdots p_{i-1}$, we see that for every $h$ there exists an index $l$ with 
$1 \le l \le d$ and 

h hpip2 ■ ■ - Pi-i h' 

J ci = = — , 

k PiP2---Pd Pi 

with $\gcd(h', p_l) = 1$. Applying Theorem 3.1 with $k = p_l$ and estimating the other factors 
trivially, we get 

(h \ Af^/'^ 

- {a{n + r) - a{n)) j < N^'^r log + N^~^r + iV^ + riV^-^/^ 

which gives the statement of the theorem. □ 